AITopics

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing Systems, Feb-9-2026, 23:00:43 GMT

77e5109bdf9f337e11e004c22c8ac89d-Paper-Conference.pdf

artificial intelligence, conformation, machine learning, (15 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Oakland (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Jalilifard, Amir, Rocha, Anderson de Rezende, Raimundo, Marcos Medeiros

Reasoning Distillation and Structural Alignment for Improved Code Generation

arXiv.org Artificial Intelligence, Oct-21-2025

Effective code generation with language models hinges on two critical factors: accurately understanding the intent of the prompt and generating code that applies algorithmic reasoning to produce correct solutions capable of passing diverse test cases while adhering to the syntax of the target programming language. Unlike other language tasks, code generation requires more than accurate token prediction; it demands comprehension of solution-level and structural relationships rather than merely generating the most likely tokens. Very large language models (VLLMs) are capable of generating detailed steps toward the correct solution of complex tasks where reasoning is crucial in solving the problem. Such reasoning capabilities may be absent in smaller language models. Therefore, in this work, we distill the reasoning capabilities of a VLLM into a smaller, more efficient model that is faster and cheaper to deploy. Our approach trains the model to emulate the reasoning and problem-solving abilities of the VLLM by learning to identify correct solution pathways and establishing a structural correspondence between problem definitions and potential solutions through a novel method of structure-aware loss optimization. This enables the model to transcend token-level generation and to deeply grasp the overarching structure of solutions for given problems. Experimental results show that our fine-tuned model, developed through a cheap and simple-to-implement process, significantly outperforms our baseline model in terms of pass@1, average data flow, and average syntax match metrics across the MBPP, MBPP Plus, and HumanEval benchmarks.
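The abstract reports pass@1 but does not say how it is computed. As a rough sketch, assuming the commonly used unbiased pass@k estimator (n samples per problem, c of which pass all tests), the metric could be computed as below. The function names `pass_at_k` and `mean_pass_at_k` are illustrative, not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass.

    Assumption: this is the standard unbiased estimator
    1 - C(n - c, k) / C(n, k); the paper may compute pass@1 differently.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset of the samples contains a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to the fraction of passing samples, c / n, averaged over problems.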

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2510.17598

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.54)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Artificial Intelligence, Sep-23-2025

Canonical Representations of Markovian Structural Causal Models: A Framework for Counterfactual Reasoning

de Lara, Lucas

Counterfactual reasoning aims at answering contrary-to-fact questions like "Would Alice have recovered had she taken aspirin?" and corresponds to the most fine-grained layer of causation. Critically, while many counterfactual statements cannot be falsified, even by randomized experiments, they underpin fundamental concepts like individual-wise fairness. Therefore, providing models to formalize and implement counterfactual beliefs remains a fundamental scientific problem. In the Markovian setting of Pearl's causal framework, we propose an alternative approach to structural causal models to represent counterfactuals compatible with a given causal graphical model. More precisely, we introduce counterfactual models, also called canonical representations of structural causal models. They enable analysts to choose a counterfactual assumption via random-process probability distributions with preassigned marginals and characterize the counterfactual equivalence class of structural causal models. Using these representations, we present a normalization procedure to disentangle the (arbitrary and unfalsifiable) counterfactual choice from the (typically testable) interventional constraints. In contrast to structural causal models, this allows one to implement many counterfactual assumptions while preserving interventional knowledge, and does not require any estimation step at the individual-counterfactual layer: it only requires making a choice. Finally, we illustrate the specific role of counterfactuals in causality and the benefits of our approach on theoretical and numerical examples.
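The claim that counterfactual statements can be unfalsifiable even by randomized experiments can be illustrated with a small toy example. The sketch below is not from the paper: it assumes two hypothetical structural causal models over a binary treatment X (aspirin) and outcome Y (recovery), sharing the same exogenous noise U ~ Bernoulli(0.5). Both yield identical interventional distributions, yet they disagree completely on the joint law of the potential outcomes.

```python
def scm_a(x: int, u: int) -> int:
    """Hypothetical SCM A: recovery ignores the treatment, Y := U."""
    return u


def scm_b(x: int, u: int) -> int:
    """Hypothetical SCM B: treatment flips the outcome, Y := X XOR U."""
    return x ^ u


def interventional(f, x: int) -> float:
    """P(Y = 1 | do(X = x)) with U uniform on {0, 1}."""
    return sum(f(x, u) for u in (0, 1)) / 2


def counterfactual_joint(f) -> dict:
    """Joint distribution of (Y under do(X=0), Y under do(X=1)).

    Both potential outcomes are evaluated on the same noise value u,
    which is what makes this a counterfactual (individual-level) quantity.
    """
    joint = {}
    for u in (0, 1):
        key = (f(0, u), f(1, u))
        joint[key] = joint.get(key, 0.0) + 0.5
    return joint


if __name__ == "__main__":
    for name, f in (("A", scm_a), ("B", scm_b)):
        print(name, interventional(f, 0), interventional(f, 1),
              counterfactual_joint(f))
```

Under model A, taking aspirin never changes whether Alice recovers; under model B it always does. No experiment that only observes one treatment per individual can tell the two apart, which is the kind of counterfactual choice the abstract proposes to separate from interventional constraints.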

artificial intelligence, counterfactual conception, machine learning, (15 more...)

2507.1637

Country:

North America > United States (0.45)
Europe > France (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.86)

Industry:

Health & Medicine > Consumer Health (0.48)
Health & Medicine > Pharmaceuticals & Biotechnology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial Intelligence, Jul-23-2025

Physical models realizing the transformer architecture of large language models

Chen, Zeqian

The introduction of the transformer architecture in 2017 marked the most striking advancement in natural language processing. The transformer is a model architecture relying entirely on an attention mechanism to draw global dependencies between input and output. However, we believe there is a gap in our theoretical understanding of what the transformer is, and how it works physically. From a physical perspective on modern chips, such as those under 28 nm, modern intelligent machines should be regarded as open quantum systems beyond conventional statistical systems. Therefore, in this paper, we construct physical models realizing large language models based on a transformer architecture as open quantum systems in the Fock space over the Hilbert space of tokens. Our physical models underlie the transformer architecture for large language models.

large language model, machine learning, transformer architecture, (15 more...)

2507.13354

Country:

Europe > United Kingdom > England (0.14)
Asia > China (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Freeborn, David Peter Wallis

Compositional Understanding in Signaling Games

arXiv.org Artificial Intelligence, Jul-22-2025

Even when the signalers send compositional messages, the receivers do not interpret them compositionally. When information from one message component is lost or forgotten, the information from other components is also erased. In this paper I construct signaling game models in which genuine compositional understanding evolves. I present two new models: a minimalist receiver who only learns from the atomic messages of a signal, and a generalist receiver who learns from all of the available information. These models are in many ways simpler than previous alternatives, and allow the receivers to learn from the atomic components of messages.

artificial intelligence, information, machine learning, (20 more...)

2507.15706

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine Learning, Jan-30-2025

Unfaithful Probability Distributions in Binary Triple of Causality Directed Acyclic Graph

Liu, Jingwei

Faithfulness is a foundational assumption linking probability distributions and graphs in causal discovery and causal inference. In this paper, several examples of unfaithful probability distributions are constructed for three-vertex binary causal directed acyclic graph (DAG) structures; these are not faithful to the causal DAGs described in J. M. Robins et al., "Uniform consistency in causal inference," Biometrika (2003), 90(3): 491-515. The general unfaithful probability distribution with multiple independence and conditional independence relations in a binary triple causal DAG is also given.
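The abstract does not reproduce the paper's constructions. As a minimal sketch of what an unfaithful distribution on a three-vertex binary DAG looks like, the textbook XOR example (assumed here, not taken from the paper) uses the collider X -> Z <- Y with X and Y independent fair coins and Z = X XOR Y. The DAG implies only that X and Y are independent, yet the distribution also makes Z marginally independent of each parent.

```python
from itertools import product

NAMES = ("x", "y", "z")

# Joint distribution p(x, y, z) for X, Y ~ Bernoulli(0.5), Z = X XOR Y.
JOINT = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}


def prob(**fixed) -> float:
    """Probability that the named variables take the given values."""
    return sum(
        p for outcome, p in JOINT.items()
        if all(outcome[NAMES.index(k)] == v for k, v in fixed.items())
    )


def independent(a: str, b: str, tol: float = 1e-12) -> bool:
    """Check marginal independence of two binary variables."""
    return all(
        abs(prob(**{a: i, b: j}) - prob(**{a: i}) * prob(**{b: j})) <= tol
        for i, j in product((0, 1), repeat=2)
    )


def independent_given(a: str, b: str, c: str, tol: float = 1e-12) -> bool:
    """Check a _||_ b | c via p(a,b,c) p(c) = p(a,c) p(b,c)."""
    return all(
        abs(prob(**{a: i, b: j, c: k}) * prob(**{c: k})
            - prob(**{a: i, c: k}) * prob(**{b: j, c: k})) <= tol
        for i, j, k in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("X indep Z:", independent("x", "z"))
    print("Y indep Z:", independent("y", "z"))
    print("X indep Z given Y:", independent_given("x", "z", "y"))
```

The two marginal independences between Z and its parents are not implied by d-separation in X -> Z <- Y, so the distribution is Markov but not faithful to that DAG; conditioning on Y restores the dependence between X and Z.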

faithful, independence, probability distribution, (13 more...)

2501.18337

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > Czechia > Prague (0.05)
(7 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

arXiv.org Artificial Intelligence, Jan-23-2025

A Data-driven Dynamic Temporal Correlation Modeling Framework for Renewable Energy Scenario Generation

Dong, Xiaochong, Liu, Yilin, Zhang, Xuemin, Mei, Shengwei

Renewable energy power is influenced by the atmospheric system, which exhibits nonlinear and time-varying features. To address this, a dynamic temporal correlation modeling framework is proposed for renewable energy scenario generation. A novel decoupled mapping path is employed for joint probability distribution modeling, formulating regression tasks for both marginal distributions and the correlation structure using proper scoring rules to ensure the rationality of the modeling process. The scenario generation process is divided into two stages. Firstly, the dynamic correlation network models temporal correlations based on a dynamic covariance matrix, capturing the time-varying features of renewable energy while enhancing the interpretability of the black-box model. Secondly, the implicit quantile network models the marginal quantile function in a nonparametric, continuous manner, enabling scenario generation through marginal inverse sampling. Experimental results demonstrate that the proposed dynamic correlation quantile network outperforms state-of-the-art methods in quantifying uncertainty and capturing dynamic correlation for short-term renewable energy scenario generation.
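The two-stage idea described above (a correlation structure over time steps, then marginal inverse sampling through a quantile function) can be sketched in a few lines. Everything specific here is an assumption for illustration: a fixed correlation matrix with exponential decay stands in for the paper's learned dynamic covariance matrix, a Gaussian copula links the time steps, and `toy_quantile` is a hypothetical stand-in for the implicit quantile network.

```python
import math
import random
from statistics import NormalDist


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def toy_quantile(u: float, t: int) -> float:
    """Hypothetical marginal quantile function for hour t.

    Maps a probability level u in [0, 1] to a capacity factor in [0, 1];
    it is monotone in u because the exponent stays positive.
    """
    shape = 1.0 + 0.5 * math.sin(2.0 * math.pi * t / 24.0)
    return u ** shape


def generate_scenarios(n_scenarios: int, horizon: int,
                       decay: float = 0.8, seed: int = 0):
    """Draw temporally correlated scenarios by marginal inverse sampling."""
    rng = random.Random(seed)
    # Assumed static correlation: corr[i][j] = decay ** |i - j|.
    corr = [[decay ** abs(i - j) for j in range(horizon)]
            for i in range(horizon)]
    low = cholesky(corr)
    std_normal = NormalDist()
    scenarios = []
    for _ in range(n_scenarios):
        eps = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
        # Correlated standard normals: z = L @ eps.
        z = [sum(low[t][k] * eps[k] for k in range(t + 1))
             for t in range(horizon)]
        # Gaussian copula step: map to uniform probability levels.
        u = [std_normal.cdf(v) for v in z]
        # Inverse sampling through each time step's quantile function.
        scenarios.append([toy_quantile(u[t], t) for t in range(horizon)])
    return scenarios
```

A call such as `generate_scenarios(100, 24)` returns 100 day-ahead trajectories of 24 values each. In the framework described above, both the correlation matrix and the quantile function would be conditioned on inputs and produced by trained networks rather than fixed as they are here.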

artificial intelligence, correlation, machine learning, (18 more...)

2501.14233

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
Asia > China > Beijing > Beijing (0.05)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Wind (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Modeling & Simulation (0.89)
(2 more...)

Neural Information Processing Systems, Jan-21-2025, 20:18:20 GMT

Reviews: Probabilistic Logic Neural Networks for Reasoning

This paper addresses knowledge base completion, i.e., filling in the missing relations between two entities, by combining a statistical relational model such as Markov logic networks (MLNs) with a knowledge graph embedding (KGE) method such as TransE. The authors define a set of rules to be used in MLNs and then define a joint probability distribution over the observed and hidden triplets. Similarly, they define a joint probability distribution using KGE approaches (specifically, they chose the TransE model). They then employ the variational EM algorithm to learn the MLN weights and finally predict the probabilities of hidden triplets. Originality: I really liked the paper and thoroughly enjoyed reading it.
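For readers unfamiliar with TransE, the model scores a triplet (head, relation, tail) by how closely head + relation matches tail in embedding space. The sketch below shows that distance on hand-written toy vectors. The `triplet_probability` helper, which squashes the distance into (0, 1) with a sigmoid and a `margin` parameter, is a hypothetical illustration and not necessarily how the reviewed paper defines its probabilities.

```python
import math


def transe_distance(head, relation, tail) -> float:
    """L2 distance ||h + r - t||; smaller means more plausible in TransE."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def triplet_probability(head, relation, tail, margin: float = 1.0) -> float:
    """Hypothetical mapping from TransE distance to a probability.

    Assumption: sigmoid(margin - distance); the margin value is arbitrary.
    """
    return 1.0 / (1.0 + math.exp(-(margin - transe_distance(head, relation, tail))))


if __name__ == "__main__":
    # Toy 2-d embeddings, chosen by hand so that h + r lands exactly on t_good.
    h, r = [1.0, 0.0], [0.0, 1.0]
    t_good, t_bad = [1.0, 1.0], [-1.0, -1.0]
    print(triplet_probability(h, r, t_good), triplet_probability(h, r, t_bad))
```

A hidden triplet whose tail sits near head + relation receives a higher probability than one whose tail is far away, which is the signal a variational E-step over hidden triplets could draw on.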

joint probability distribution, probabilistic logic neural network, reasoning, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.59)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.40)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.39)